10 research outputs found

A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

Author: A Gil
A Luckow
A Tiwari
B Ludäscher
BC Pierce
BP Vandervalk
C Lin
Cameron Mura
CS Soto
D Earl
D Frishman
E Bartocci
E Deelman
J Dean
J Eker
J Misra
J Orvis
JP Morrison
K Hinsen
K Jeffay
M Halling-Brown
Marcin Cieślik
MC Schatz
MR Berthold
MWEJ Fiers
P Liu
P Romano
S Hoon
S Kannan
T Oinn
T Tu
U Radetzki
W Van der Aalst
WM Johnston
Z Yao
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results: To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain-specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions: PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy can also be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing.
PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage examples.
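
The batching idea in the abstract above can be illustrated with a minimal sketch. This is not PaPy's API: it uses only the Python standard library, and the names `batched_map`, `reverse_complement`, `gc_fraction`, and `run_pipeline` are hypothetical. It assumes a simple linear two-stage pipeline rather than a general directed acyclic graph, and shows how a batch size bounds the number of items evaluated in parallel and held in memory at once.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched_map(func, items, batch_size, pool):
    """Lazily apply func to items, evaluating one batch at a time in parallel."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # At most `batch_size` items are in flight for this stage at once.
        yield from pool.map(func, batch)


def reverse_complement(seq):
    """Hypothetical pipeline component: reverse-complement a DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def gc_fraction(seq):
    """Hypothetical pipeline component: fraction of G and C bases."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_pipeline(sequences, batch_size=2, workers=2):
    """Chain two stages; each stage pulls batches from the one before it."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stage1 = batched_map(reverse_complement, sequences, batch_size, pool)
        stage2 = batched_map(gc_fraction, stage1, batch_size, pool)
        return list(stage2)
```

In this sketch a larger `batch_size` gives the pool more work to run concurrently, while a smaller one keeps evaluation closer to item-by-item laziness.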

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

Author: A Labarga
AR Jones
B Wallner
BioMoby Consortium
C Steinbeck
D Smedley
E Jain
E Willighagen
Egon L Willighagen
EW Sayers
GL Holliday
H Stockinger
H Sugawara
Jarl ES Wikberg
Johannes Wagener
L Stein
LM Vaquero
M Hucka
M Lapins
MA Larkin
MD Wilkinson
MWEJ Fiers
N Adams
O Spjuth
Ola Spjuth
P Fisher
P Murray-Rust
PBT Neerincx
R Kottmann
RD Dowell
S Hoon
S Hunter
S Kaarthik
S Kerrien
S Kuhn
S Miyazaki
T Oinn
UniProt Consortium
X Dong
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Life sciences make heavy use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine-accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability and the inability of services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results: We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) that covers discovery, asynchronous invocation, and definition of data types in the service. Because XMPP cloud services are capable of asynchronous communication, clients do not have to poll repeatedly for status; instead, the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion: XMPP with its extensions is a powerful protocol for cloud services that demonstrates several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allow for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next-generation online services in bioinformatics.
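
The contrast between polling and asynchronous invocation described in the abstract above can be sketched as follows. This is not XMPP and not the IO Data extension: it is a minimal illustration using Python's standard `asyncio` module, and the names `service`, `invoke`, and `run` are hypothetical. The assumption is that the service is handed a callback and pushes the result when it finishes, so the client never asks for status.

```python
import asyncio


async def service(job_input, notify):
    """Hypothetical service: does the work, then pushes the result to the client."""
    await asyncio.sleep(0.01)  # stand-in for a long-running analysis
    notify(job_input.upper())


async def invoke(job_input):
    """Hypothetical client: submits a job and waits to be notified."""
    loop = asyncio.get_running_loop()
    result = loop.create_future()
    # The client hands over a callback and then simply waits; it never polls.
    task = asyncio.create_task(service(job_input, result.set_result))
    value = await result
    await task
    return value


def run(job_input):
    """Synchronous entry point for the sketch."""
    return asyncio.run(invoke(job_input))
```

Calling `run("acgt")` submits one job and returns once the service has delivered its result through the callback.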

Maastricht University Research Portal

Crossref

Springer - Publisher Connector

PubMed Central

Open Access LMU

An integrative approach for building personalized gene regulatory networks for precision medicine

Author: A Battle
A Brodie
A Butler
A Dixit
A Fabregat
A Feelders
A Lachmann
A McGovern
A Ocone
A Saha
A Zeisel
AA Pai
AB Rosenberg
AJ Martins
AR Forrest
AS Weintraub
AT Specht
B Adamson
B Zhang
C Sima
C Trapnell
C Weinreb
CE Shannon
CS Greene
D Adams
D Dijk van
D Marbach
D Pe'er
DA Jaitin
DA Knowles
DJ Schaid
DM Camacho
DV Zhernakova
Dylan H. de Vries
EE Schadt
EH Simpson
ENCODE Project Consortium
ER Gamazon
ET Whittaker
European Union
EZ Macosko
F Buettner
F Tang
FK Hamey
G Manno La
G McVicker
GK Marinov
GTEx Consortium Laboratory, Data Analysis & Coordinating Center (LDACC)-Analysis Working Group, Statistical Methods groups-Analysis Working Group, Enhancing GTEx (eGTEx) groups, NIH Common Fund, NIH/NCI
GXY Zheng
H Ogata
H Shi
Harm Brugge
Harm-Jan Westra
HC Stoeklé
HJ Westra
HM Kang
J Cao
J Marchini
J Menche
J Vivian
J Yang
JD Buenrostro
JD Finkle
JD Welch
JF Degner
JK Pickrell
JR Li
JS Desai
JT Bell
KJ Karczewski
L Collado-Torres
L Gao
LM Sollid
Lude Franke
M Claussnitzer
M Fagny
M Kasowski
M Sanchez-Castillo
M Stoeckius
MGP Wijst van der
MJ Favé
ML Whitfield
Monique G. P. van der Wijst
MWEJ Fiers
N Silvester
NJ Schork
P Datlinger
PM Visscher
Q Peng
Q Wang
R Leinonen
RE Consortium
S Aibar
S Chatterjee
S Chen
S Dam van
S Ghazanfar
S Islam
S Rashid
S Raychaudhuri
S Smemo
S Tasaki
TE Bartlett
TE Chan
TH Pers
U Herbach
U Shim
V Svensson
VG Cheung
W Liao
W Saelens
WA Schmitt Jr
YX Wang
Z Bar-Joseph
Z Ji
Z Zhu
Publication date: 01/12/2018

Only a small fraction of patients respond to the drug prescribed to treat their disease, which means that most are at risk of unnecessary exposure to side effects through ineffective drugs. This inter-individual variation in drug response is driven by differences in gene interactions caused by each patient's genetic background, environmental exposures, and the proportions of specific cell types involved in disease. These gene interactions can now be captured by building gene regulatory networks, by taking advantage of RNA velocity (the time derivative of the gene expression state), the ability to study hundreds of thousands of cells simultaneously, and the falling price of single-cell sequencing. Here, we propose an integrative approach that leverages these recent advances in single-cell data with the sensitivity of bulk data to enable the reconstruction of personalized, cell-type- and context-specific gene regulatory networks. We expect this approach will allow the prioritization of key driver genes for specific diseases and will provide knowledge that opens new avenues towards improved personalized healthcare.

Proceedings - University of Groningen

Crossref

University of Groningen

ARTS repository - University of Groningen

Directory of Open Access Journals

Dissertations of the University of Groningen

In Silico Prediction of Allergenic Proteins

Author: A Martinez Barrio
MB Stadler
MWEJ Fiers
O Ivanciuc
S Saha
Publication venue: 'Springer Science and Business Media LLC'

Crossref

COMBat: visualizing co-occurrence of annotation terms

Author: Brakel RBJ Remko van
Fiers MWEJ
Francke C
Westenberg MA Michel
Wetering HMM Huub van de
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

We propose a visual analysis approach that employs a matrix-based visualization technique to explore relations between annotation terms in biological data sets. Our flexible framework provides various ways to form combinations of data elements, which results in a co-occurrence matrix. Each cell in this matrix stores a list of items associated with the combination of the corresponding row and column element. By re-arranging the rows and columns of this matrix, and color-coding the cell contents, patterns become visible. Our prototype tool COMBat allows users to construct a new matrix on the fly by selecting subsets of items of interest, or filtering out uninteresting ones, and it provides various additional interaction techniques. We illustrate our approach with a few case studies concerning the identification of functional links between the presence of particular genes or genomic sequences and particular cellular processes.
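
A co-occurrence matrix of the kind described in the abstract above, in which each cell holds the list of items for a row and column combination, can be sketched in a few lines. This is not COMBat's implementation: the function name `cooccurrence` and the input layout (a dict from item to its annotation terms) are assumptions made for illustration, and the sketch assumes rows and columns are both annotation terms, giving a symmetric matrix.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def cooccurrence(annotations):
    """Map each (term_a, term_b) pair to the sorted list of items carrying both terms."""
    cells = defaultdict(list)
    for item in sorted(annotations):
        terms = sorted(set(annotations[item]))
        for a, b in combinations_with_replacement(terms, 2):
            cells[(a, b)].append(item)
            if a != b:
                cells[(b, a)].append(item)  # keep the matrix symmetric
    return dict(cells)


# Hypothetical example data: three genes with annotation terms.
example = {
    "geneA": ["transport", "membrane"],
    "geneB": ["membrane"],
    "geneC": ["transport", "membrane"],
}
matrix = cooccurrence(example)
```

Here `matrix[("membrane", "transport")]` lists the genes annotated with both terms, and the length of each cell's list is the kind of value a color-coded view could encode.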

Repository TU/e

Crossref

Pure OAI Repository

Case study: Visualization of annotated DNA sequences

Author: Fiers MWEJ
Nap JP
Peeters THJM Tim
Wetering HMM Huub van de
Wijk JJ Jarke = Jack van
Publication venue: Eurographics Association
Publication date: 01/01/2004

DNA sequences and their annotations form ever-expanding data sets. Proper exploration of such data sets requires new tools for visualization and analysis. In this case study, we have defined the requirements for a visualization tool for annotated DNA sequences. We have implemented these requirements in a new and flexible tool for browsing and comparing annotated DNA sequences interactively and in real time. The use of standard information visualization techniques, such as linked windows, perspective walls, and smooth interaction, enables genome researchers to obtain better insight into large DNA data sets in an effective, efficient, and attractive way.

Repository TU/e

Pure OAI Repository

Complete genome sequence of Lactobacillus plantarum WCFS1

Author: Boekhorst J.
Bron P.A.
de Vries M
Fiers MWEJ
Groot MNN
Hoffer S.M.
Kerkhoven R.
Kleerebezem M.
Klein-Lankhorst R.M.
Kranenburg R.
Kuipers O.P.
Leer R
Molenaar D
Peters SA
Sandbrink HM
Siezen R.J.
Stiekema W.J.
Tarchini R
Ursing B
Vos W.M.
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2003

Item does not contain fulltext

VU Research Portal

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

Wageningen University & Research Publications

Radboud Repository

University of Groningen Digital Archive

Dissertations of the University of Groningen

A community-based transcriptomics classification and nomenclature of neocortical cell types

Author: A Fairén
A Paul
A Regev
A Tsiola
A Zeisel
AB Rosenberg
AK Shalek
Alok Nath
AM Klein
AM Masci
Andreas Savas Tolias
Angelica Foggetti
Ann Clemens
AP Alivisatos
AR Woodruff
Argel Aguilar-Valles
B Cauli
B Mihaljević
B Smith
B Tasic
Boudewijn Lelieveldt
C Mayer
C Roselli
C Trapnell
Christiaan P. J. de Kock
Christian Wozny
CI Bargmann
Concha Bielza
CR Cadwell
CR Gerfen
D Arendt
D Dumitriu
D Mi
D Osumi-Sutherland
Detlev Arendt
DH Hubel
Dirk Feldmeyer
DP Pelvig
DW Wheeler
E Boldog
E Lein
EC Bush
Ed Lein
EM Martersteck
Emma Louise Louth
Eric S. Kuebler
Esther Serrano-Saiz
EZ Macosko
FC Mansergh
G Fishell
GA Ascoli
Giorgio A. Ascoli
GM Shepherd
Gordon James Fishell
GX Zheng
Gábor Tamás
H Markram
H Zeng
Hajime Hirase
Henner Koch
Hermany Munguba Vieira
Homeira Moradi Chameh
Hongkui Zeng
Huibert D. Mansvelder
Irina Bystron
J Bard
J DeFelipe
J Kozloski
J Szentágothai
J Winnubst
Jan H. Lui
Javier DeFelipe
Jean Rossier
Jens Hjerling-Leffler
JG White
Jochen F. Steiger
Josh Huang
Joshua R. Sanes
JR Ecker
JS Lund
Juan Yuan
Julio Martinez-Trujillo
K Shekhar
K Sugino
K Tang
KD Harris
Keagan Dunville
Kenta Hagihara
KH Chen
Konstantin Khodosevich
L Guerra
L Liu
L van der Maaten
Liu Yong
LM McGarry
LS Krimer
M Crow
M Fu
M Garber
M He
M Häring
MA Tosches
Maiken Nedergaard
Malte Kühnemund
Marco Capogna
Maria Antonietta Tosches
Markus M. Hilscher
Michael Hawrylycz
Miguel Turrero García
MJ Hawrylycz
Moritz Helmstaedter
MWEJ Fiers
N Habib
N Kessaris
Nadia Aalling
Natalia A. Goriounova
Netanel Ofer
Ole Kiehn
Onur Güntürkün
Oscar Marin
P Pereira
P Somogyi
Parviz Ghaderi
Pavel Němec
Pedro Larrañaga
Peter Somogyi
R Armañanzas
R Benavides-Piccione
R Durruthy-Durruthy
R Lorente de Nó
R Santana
R Yuste
Rafael Yuste
Rafiq Huda
RD Hodge
Rebecca Hodge
Richard Scheuermann
Richárd Fiáth
RJ Douglas
Ruben Armananzas Arnedillo
S Ramón y Cajal
S Siebert
SA Anderson
Samuel Pontes
Sandra Esmeralda Dos Santos
SB Nelson
SF Altschul
SJB Butt
Suzana Herculano
T Bakken
T Stuart
TD Yager
TE Bakken
Thomas V. Wuttke
TJ Nowakowski
Tobias Borgtoft Bergmann
TS Andrews
U Sümbül
Ulrich Gottfried Pfisterer
Vahid Bokharaie
Vanessa Jane Hall
VY Kiselev
William Redmond
X Jiang
Xuefan Gao
Y Kawaguchi
Yoonjeung Chang
YR Peng
ZJ Huang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2019

To understand the function of cortical circuits it is necessary to classify their underlying cellular diversity. Traditional attempts based on comparing anatomical or physiological features of neurons and glia, while productive, have not resulted in a unified taxonomy of neural cell types. The recent development of single-cell transcriptomics has enabled, for the first time, systematic high-throughput profiling of large numbers of cortical cells and the generation of datasets that hold the promise of being complete, accurate and permanent. Statistical analyses of these data have revealed the existence of clear clusters, many of which correspond to cell types defined by traditional criteria, and which are conserved across cortical areas and species. To capitalize on these innovations and advance the field, we, the Copenhagen Convention Group, propose the community adopts a transcriptome-based taxonomy of the cell types in the adult mammalian neocortex. This core classification should be ontological, hierarchical and use a standardized nomenclature. It should be configured to flexibly incorporate new data from multiple approaches, developmental stages and a growing number of species, enabling improvement and revision of the classification. This community-based strategy could serve as a common foundation for future detailed analysis and reverse engineering of cortical circuits and serve as an example for cell type classification in other parts of the nervous system and other organs.

VU Research Portal

Cold Spring Harbor Laboratory Institutional Repository

Edinburgh Research Explorer

Leiden University Scholary Publications

Digital.CSIC

MPG.PuRe

Archivo Digital UPM

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Copenhagen University Research Information System

PubMed Central